ABSTRACT

We use a sample of ~200,000 galaxies drawn from the Sloan Digital Sky Survey
(SDSS) with 0.01 < z < 0.3 and $-23 < M_{^{0.1}r} < -16$ to study how clustering depends
on properties such as stellar mass ($M_*$), colour ($g - r$), 4000 Å break strength (D4000),
concentration index ($C$), and stellar surface mass density ($\mu_*$). Our measurements of
$w_p(r_p)$ as a function of r-band luminosity are in excellent agreement with previous
two-degree Field Galaxy Redshift Survey and SDSS analyses. We compute $w_p(r_p)$ as
a function of stellar mass and we find that more massive galaxies cluster more strongly
than less massive galaxies, with the difference increasing above the characteristic stellar
mass $M^*$ of the Schechter mass function. We then divide our sample according to
colour, 4000 Å break strength, concentration and surface density. As expected, galaxies
with redder colours, larger 4000 Å break strengths, higher concentrations and larger
surface mass densities cluster more strongly. The clustering differences are largest on
small scales and for low mass galaxies. At fixed stellar mass, the dependences of clustering
on colour and 4000 Å break strength are similar. Different results are obtained
when galaxies are split by concentration or surface density. The dependence of $w_p(r_p)$
on $g - r$ and D4000 extends out to physical scales that are significantly larger than those
of individual dark matter haloes ($> 5\,h^{-1}$ Mpc). This large-scale clustering dependence
is not seen for the parameters $C$ or $\mu_*$. On small scales ($< 1\,h^{-1}$ Mpc), the amplitude
of the correlation function is constant for "young" galaxies with 1.1 < D4000 < 1.5
and a steeply rising function of age for "older" galaxies with D4000 > 1.5. In contrast,
the dependence of the amplitude of $w_p(r_p)$ on concentration on scales less than $1\,h^{-1}$
Mpc is strongest for disk-dominated galaxies with $C < 2.6$. This demonstrates that
different processes are required to explain environmental trends in the structure and
in the star formation history of galaxies.

Key words: galaxies: clusters: general - galaxies: distances and redshifts - cosmology:
theory - dark matter - large-scale structure of Universe



1 INTRODUCTION 

Our understanding of the large-scale structure of the Uni- 
verse has come primarily from studies of redshift surveys of 
nearby galaxies. The two-point correlation function (2PCF) 
of galaxies has long served as the primary way of quanti- 
fying the clustering properties of galaxies in these surveys 
(for example, Peebles 1980). As the fundamental lowest or- 
der statistic, the 2PCF is simple to compute and provides a 
full statistical description for Gaussian fields. It can also be 
easily compared with the predictions of theoretical models. 
Such comparisons have led to the conclusion that the observations
are not consistent with the predictions of the standard
ΛCDM "concordance" model unless there is a scale-dependent
bias in the distribution of galaxies relative to the
dark matter (Jing, Mo & Börner 1998; Jenkins et al. 1998;
Gross et al. 1998).

Benson et al. (2000a) clarified how the dependence of 
galaxy formation efficiency on halo mass could lead to just 
such a scale-dependent bias. On large scales, the bias in the 
galaxy distribution is related in a simple way to the bias in 
the distribution of dark haloes. On small scales, the ampli- 
tude and slope of the correlation function is determined by 
the interplay of a number of different effects, including the 
distribution of the number of galaxies that occupy a halo 
of given mass and the fact that the brightest galaxy in each
halo is always located near the halo centre. These ideas have
been further developed into the so-called "halo occupation
distribution" (HOD) approach by many different authors
(for example Jing, Mo & Börner 1998; Seljak 2000; Peacock
& Smith 2000; Berlind & Weinberg 2002; Cooray & Sheth
2002; Yang, Mo & van den Bosch 2003).

The HOD approach enables one to understand why the 
correlation function of L* galaxies is close to a power law 
over nearly four orders of magnitude in amplitude in a flat, 
$\Omega_0 = 0.3$ CDM universe. However, an important corollary
is that the clustering properties of galaxies ought to depend 
strongly on galaxy colour, star formation rate and morphol- 
ogy, because the halo occupation distributions of galaxies 
are predicted to depend sensitively on these properties (see 
for example Kauffmann, Nusser & Steinmetz 1997; Kauff- 
mann et al. 1999; Benson et al. 2000b). 

The fact that the measured correlations of galaxies 
differ according to type has been known for almost three 
decades. Davis & Geller (1976) computed angular corre- 
lations for galaxies in the Uppsala Catalog and showed 
that elliptical-elliptical correlations were characterized by a 
power law with steeper slope than spiral-spiral correlations. 
Dressler (1980) quantified this as a relation between galaxy
type and local galaxy density, with an increasing elliptical
and S0 population and a corresponding decrease in spirals
in the densest environments. 

The large redshift surveys assembled in recent years, 
e.g. 2dFGRS and SDSS, have provided angular positions 
and redshifts for samples of hundreds of thousands of galax- 
ies and have allowed the dependence of clustering on galaxy 
properties to be studied with unprecedented accuracy. These 
studies have established that the clustering of galaxies in the 
local Universe depends on a variety of factors, including lu- 
minosity (Norberg et al. 2001, Zehavi et al. 2002, Zehavi et 
al. 2005), colour (Zehavi et al. 2002, Zehavi et al. 2005), con- 
centration (Zehavi et al. 2002, Goto et al. 2003), and spec- 
tral type (Norberg et al. 2002, Budavari et al. 2003, Madg- 
wick et al. 2003). These studies have revealed that galaxies 
with red colours, bulge-dominated morphologies and spec- 
tral types indicative of old stellar populations reside prefer- 
entially in dense regions (Zehavi et al. 2005 and references 
therein, hereafter Z05). Furthermore, luminous galaxies clus- 
ter more strongly than less luminous galaxies, with the lu- 
minosity dependence becoming more significant for galax- 
ies brighter than L* (the characteristic luminosity of the 
Schechter [1976] function). When galaxies are divided by 
colour, redder galaxies show a higher amplitude and steeper 
correlation function at all luminosities. 

In order to interpret these clustering dependencies in 
the framework of galaxy formation models, it is useful to 
express the clustering results in terms of physical quanti- 
ties such as galaxy mass, size and mean stellar age, instead 
of more traditional quantities such as luminosity or colour. 
Galaxy luminosity does not necessarily correlate very closely 
with stellar mass (the dominant baryonic component in all 
but the smallest galaxies). Both luminosity and colour are
subject to strong dependences on the fraction of young stars 
in the galaxy and on its dust content. These effects also 
complicate comparisons between the clustering of low red- 
shift and high redshift galaxies. It is now known that the 
star formation rates in galaxies evolve very strongly as a 
function of redshift. As a result, if one measures a change
in clustering amplitude at fixed luminosity, it is not simple
to ascertain which part of the effect is caused by the evolution
in the stellar mass-to-light ratio ($M_*/L$) and which part
by a change in the halo occupation distributions at higher
redshift.

Figure 1. The maximum redshift out to which a given mass in
stars with the maximum possible $M_*/L$ would be detected. The
computation assumes a 13 Gyr stellar population and the
Bruzual & Charlot models.

In this paper we study the dependence of galaxy clus- 
tering on both luminosity and stellar mass using a large 
sample of galaxies drawn from the Sloan Digital Sky Survey. 
We then probe the dependence on other physical parameters,
including colour ($g - r$), 4000 Å break strength (D4000),
concentration parameter ($C$) and stellar surface mass density
($\mu_*$). The first two quantities, i.e. $g - r$ and D4000, are
parameters associated with the recent star formation history
of the galaxy (D4000 is expected to be less sensitive to dust 
attenuation effects than colour), whereas the other two are 
related to galaxy structure. We first describe the observa- 
tional samples used for the analysis. In §3 we outline our 
method of measuring the 2PCF from large redshift surveys. 
The results are described in §4 and summarized in the final 
section. 

Throughout this paper, we assume a cosmological
model with density parameter $\Omega_0 = 0.3$ and cosmological
constant $\Lambda_0 = 0.7$. To avoid the $-5\log_{10} h$ factor,
a Hubble constant of $h = 1$, in units of 100 km s$^{-1}$ Mpc$^{-1}$,
is assumed throughout when computing absolute
magnitudes. Quantities with a superscript
asterisk are those at the characteristic luminosity/mass (e.g.
characteristic luminosity $L^*$), whereas quantities with
a subscript asterisk refer to quantities associated with the
stars in a galaxy (e.g. stellar mass $M_*$).



2 OBSERVATIONAL SAMPLES 
2.1 NYU-VAGC 

The Sloan Digital Sky Survey (SDSS) is the most ambi- 
tious optical imaging and spectroscopic survey to date. The 
survey goals are to obtain photometry of a quarter of the 
sky and spectra of nearly one million objects. Imaging is 
obtained in the u, g, r, i, z bands (Fukugita et al. 1996; 
Smith et al. 2002; Ivezic et al. 2004) with a special pur- 
pose drift scan camera (Gunn et al. 1998) mounted on the 
SDSS 2.5 meter telescope at Apache Point Observatory. The 
imaging data are photometrically (Hogg et al. 2001) and astrometrically
(Pier et al. 2003) calibrated, and used to select
stars, galaxies, and quasars for follow-up fibre spectroscopy.
Spectroscopic fibres are assigned to objects on the sky using
an efficient tiling algorithm designed to optimize completeness
(Blanton et al. 2003b). The details of the survey
strategy can be found in York et al. (2000) and an overview
of the data pipelines and products is provided in the Early
Data Release paper (Stoughton et al. 2002).

Figure 2. The upper panel shows contours of the number
density of galaxies in the plane of stellar mass vs luminosity. The
black lines are for the data and the red lines are reconstructed using the
Gaussian functions that best fit the stellar mass distribution in
282 different luminosity intervals. The contour levels increase
by factors of 2 from the lowest ($15\ [0.2\,\mathrm{mag}]^{-1} [0.2 \log_{10} M_\odot]^{-1}$) to
the highest ($7680\ [0.2\,\mathrm{mag}]^{-1} [0.2 \log_{10} M_\odot]^{-1}$). The lower panel
shows examples of the Gaussian distributions. Histograms show
the data and solid lines the best fits. $N_0$ is the Gaussian height.
The corresponding centres (blue points) and widths (error bars) of
the Gaussians are shown in the upper panel.

The large areal coverage and moderately deep survey 
limit (a mean redshift of ~ 0.1 for galaxies in the main spec- 
troscopic sample) make the SDSS ideal for studying large- 
scale structure and the characteristics of galaxy populations 
in the local Universe. The SDSS covers two regions on the 
sky, one in the northern Galactic cap (NGC) and another in 
the southern Galactic cap (SGC). In the SGC, three stripes 
are observed, one along the celestial equator and the other 
two north and south of the equator. The NGC lies mostly 
above Galactic latitude 30°, but its footprint is adjusted 
slightly to lie within the minimum of the Galactic extinc- 
tion contours (Schlegel, Finkbeiner, & Davis 1998), resulting 
in an elliptical survey region (York et al. 2000). Currently 
the survey in the NGC consists of two separate regions, one 
along the celestial equator (hereafter NGCE) and another 
off the equator (hereafter NGCO). 

In this paper we use the New York University Value
Added Catalog (NYU-VAGC)¹, which is a catalog of local
galaxies (mostly below z ~ 0.3) constructed by Blanton
et al. (2005a) based on the SDSS Data Release Two
(DR2, Abazajian et al. 2004). Earlier proprietary versions
of this catalog have formed the basis of many SDSS investigations
of the power spectrum, correlation function, and
luminosity function of galaxies. The current version of the
NYU-VAGC consists of 693,319 photometric objects (3514
deg$^2$); 343,568 of these have redshift determinations (2627
deg$^2$), with about 85% completeness. This small subset of
the full SDSS catalog contains all of the information necessary
for analyzing the SDSS spectroscopic survey at the
catalog level. Compared with the catalogs distributed by
the SDSS DR2 Archive Servers, the NYU-VAGC is photometrically
calibrated in a more consistent way, reducing
systematic calibration errors across the sky from ~2% to
~1%. It is therefore more appropriate for statistical
studies of galaxy properties, galaxy clustering, and galaxy
evolution. The NYU-VAGC is described in detail in Blanton
et al. (2005a).

¹ http://wassup.physics.nyu.edu/vagc/

2.2 Physical quantities 

The rich stellar absorption-line spectrum of a typical SDSS 
galaxy provides unique information about its stellar content 
and dynamics. Kauffmann et al. (2003a) presented a method 
for using this information to estimate the stellar masses of 
galaxies. The amplitude of the 4000 Å break (the narrow
version of the index defined in Balogh et al. 1999) and the
strength of the Hδ absorption line (the Lick $\mathrm{H}\delta_A$ index of
Worthey & Ottaviani 1997) were used as diagnostics of the
stellar populations of the galaxies. Both indices were corrected
for the observed contributions of the emission lines
in their bandpasses. From a library of 32,000 model star formation
histories, the measured D4000 and $\mathrm{H}\delta_A$ indices were
used to obtain a maximum likelihood estimate of the z-band
$M_*/L$ for each galaxy. By comparing the colour predicted by
the best-fit model to the observed colour of the galaxy, the 
attenuation of the starlight due to dust could be estimated. 

The SDSS imaging data provide the basic structural parameters
that are used in this analysis. The z-band absolute
magnitude, combined with the estimated values of $M_*/L$
and dust attenuation $A_z$, yields the stellar mass ($M_*$). The
half-light radius in the z-band and the stellar mass yield the
effective stellar surface mass density ($\mu_* = M_*/2\pi r_{50,z}^2$, in
units of $h^2 M_\odot\,\mathrm{kpc}^{-2}$). As a proxy for Hubble type we use
the SDSS "concentration" parameter C, which is defined as 
the ratio of the radii enclosing 90% and 50% of the galaxy 
light in the r band (see Stoughton et al. 2002). Strateva et 
al. (2001) find that galaxies with C > 2.6 are mostly early- 
type galaxies, whereas spirals and irregulars have 2.0 < C < 
2.6. 
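The chain from z-band magnitude to stellar mass, surface mass density and concentration described above can be sketched as follows. This is a minimal illustration, not the pipeline of Kauffmann et al. (2003a): the function names are hypothetical, the solar absolute magnitude `M_SUN_Z` is an assumed illustrative value, and the h-scalings of the units are ignored.

```python
import math

# Assumed z-band absolute magnitude of the Sun; illustrative value only.
M_SUN_Z = 4.51


def stellar_mass(abs_mag_z, mass_to_light, dust_atten_z, m_sun=M_SUN_Z):
    """Stellar mass (solar units) from a dust-corrected z-band luminosity."""
    # Subtract the attenuation A_z (magnitudes) to recover the intrinsic magnitude.
    intrinsic_mag = abs_mag_z - dust_atten_z
    luminosity = 10.0 ** (-0.4 * (intrinsic_mag - m_sun))  # in solar luminosities
    return mass_to_light * luminosity


def surface_mass_density(mass, r50_kpc):
    """Effective surface mass density mu_* = M_* / (2 pi r50^2)."""
    return mass / (2.0 * math.pi * r50_kpc ** 2)


def concentration(r90, r50):
    """Concentration index C = R90 / R50 measured in the r band."""
    return r90 / r50


def is_early_type(c, threshold=2.6):
    """Strateva et al. (2001) split: C > 2.6 is mostly early-type."""
    return c > threshold
```

A galaxy with `concentration(r90, r50)` above 2.6 would be counted as early-type under this split; the mass and density functions take whatever length units the caller supplies for the half-light radius.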

The reader is referred to Kauffmann et al. (2003a) for a 
more detailed description of the methodology used to derive 
the stellar masses used in this paper. An analysis of how the 
physical properties of galaxies correlate with mass is presented
in Kauffmann et al. (2003b). All the parameters used
in this paper are publicly available at
http://www.mpa-garching.mpg.de/SDSS/ (see also Brinchmann et al. 2004).

2.3 Sample selection 

In this paper, all three regions in the NYU-VAGC, i.e.
NGCE, NGCO and SGC, are considered. Statistics are measured
separately for the three regions but the results are
always presented for the whole survey by combining the results
in these regions.

Table 1. Flux-limited samples selected according to luminosity/stellar mass

                                  Number of Galaxies              Percentage in Subsamples (a)
Sample  $M_{^{0.1}r}$     SGC     NGCO    NGCE    Total     g-r     D4000   C       $\log_{10}\mu_*$
L1      [-17.0, -16.0)    458     548     608     1614      27.2%   41.7%   90.7%   48.0%
L2      [-17.5, -16.5)    735     1261    1115    3111      26.2%   32.2%   82.9%   43.4%
L3      [-18.0, -17.0)    1257    2301    1695    5253      27.3%   27.6%   74.2%   41.4%
L4      [-18.5, -17.5)    2130    3808    2693    8631      31.0%   26.3%   67.0%   44.2%
L5      [-19.0, -18.0)    3657    6391    4366    14414     36.6%   29.3%   60.5%   49.3%
L6      [-19.5, -18.5)    6532    10754   8582    25868     43.9%   36.4%   55.9%   56.3%
L7      [-20.0, -19.0)    10349   16788   15740   42877     49.1%   42.6%   54.3%   61.5%
L8      [-20.5, -19.5)    14804   24688   22879   62371     51.9%   46.9%   54.9%   63.7%
L9      [-21.0, -20.0)    18460   31997   27530   77987     52.9%   51.2%   56.6%   61.9%
L10     [-21.5, -20.5)    17717   31010   25376   74103     53.4%   56.4%   58.9%   54.7%
L11     [-22.0, -21.0)    12140   20647   16252   49039     55.9%   62.7%   64.3%   41.1%
L12     [-22.5, -21.5)    5384    8895    6876    21155     61.1%   70.5%   72.6%   23.3%
L13     [-23.0, -22.0)    1267    2097    1674    5038      64.8%   78.0%   76.3%   7.20%

Sample  $\log_{10} M_*$   SGC     NGCO    NGCE    Total     g-r     D4000   C       $\log_{10}\mu_*$
M1      [9.0, 9.5)        1686    3230    2325    7241      14.7%   14.5%   55.8%   26.3%
M2      [9.5, 10.0)       4086    6695    5237    16018     23.8%   19.8%   44.3%   36.5%
M3      [10.0, 10.5)      9757    15528   14275   39560     43.4%   38.8%   49.5%   56.3%
M4      [10.5, 11.0)      17340   30519   26423   74282     55.3%   53.1%   58.9%   63.8%
M5      [11.0, 11.5)      12475   21213   16671   50359     65.5%   70.0%   70.5%   51.7%
M6      [11.5, 12.0)      1183    2082    1603    4868      71.7%   83.7%   78.1%   17.3%

(a) Percentage of objects in subsample with larger value of physical quantities.

Table 2. Volume-limited samples

                                                    Number of Galaxies
Sample  $M_{^{0.1}r}$     z                 SGC     NGCO    NGCE    Total
VL1     [-18.0, -17.0)    (0.01, 0.03)      675     1011    1016    2702
VL2     [-19.0, -18.0)    (0.02, 0.04)      986     1776    1433    4195
VL3     [-20.0, -19.0)    (0.03, 0.07)      4510    7135    4514    16159
VL4     [-21.0, -20.0)    (0.04, 0.07)      2202    3275    2076    7553
VL5     [-21.0, -20.0)    (0.04, 0.10)      6886    10021   10772   27679
VL6     [-22.0, -21.0)    (0.07, 0.16)      5566    9446    8335    23347
VL7     [-23.0, -22.0)    (0.10, 0.23)      705     1146    874     2725

Sample  $\log_{10} M_*$   z                 SGC     NGCO    NGCE    Total
VM1     [9.0, 9.5)        (0.015, 0.045)    1195    2190    1624    5009
VM2     [9.5, 10.0)       (0.020, 0.075)    3368    5622    4020    13010
VM3     [10.0, 10.5)      (0.025, 0.100)    8039    12530   11865   32434
VM4     [10.5, 11.0)      (0.040, 0.140)    14324   24403   22035   60762
VM5     [11.0, 11.5)      (0.070, 0.200)    10343   17393   14065   41801

We first select all NYU-VAGC galaxies with extinction-corrected
Petrosian magnitude 14.5 < r < 17.77. The bright
limit is chosen because the SDSS becomes incomplete for
bright galaxies with large angular size, whereas the faint
limit corresponds to the magnitude limit of the Main galaxy
sample in SDSS. Further criteria for galaxies to be included
in our analysis are: 1) they are identified as galaxies from the
Main sample (see Blanton et al. 2005a for a detailed description);
2) they lie within the redshift range 0.01 < z < 0.3
and the absolute magnitude range $-23 < M_{^{0.1}r} < -16$.
Here $M_{^{0.1}r}$ is the r-band absolute magnitude corrected to
its z = 0.1 value using the K-correction code (kcorrect
v3_1b) of Blanton et al. (2003a) and the luminosity evolution
model of Blanton et al. (2003c). Our resulting sample
includes a total of 196,238 galaxies.
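The selection cuts listed above amount to a single boolean mask over the catalogue. The sketch below assumes the relevant columns are already available as NumPy arrays; the function and argument names are hypothetical, and the K- and evolution corrections are taken as already applied to `abs_mag`.

```python
import numpy as np


def select_sample(r_petro, redshift, abs_mag, is_main):
    """Boolean mask for the parent sample.

    r_petro  : extinction-corrected Petrosian r magnitude
    redshift : spectroscopic redshift
    abs_mag  : r-band absolute magnitude corrected to z = 0.1
    is_main  : True for objects flagged as Main-sample galaxies
    """
    in_flux_limits = (r_petro > 14.5) & (r_petro < 17.77)
    in_redshift = (redshift > 0.01) & (redshift < 0.3)
    in_abs_mag = (abs_mag > -23.0) & (abs_mag < -16.0)
    return in_flux_limits & in_redshift & in_abs_mag & is_main
```

Applying the mask, for example `redshift[select_sample(...)]`, keeps only the galaxies that satisfy every criterion at once.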

The galaxies are then divided into a variety of different 
subsamples. We create 13 subsamples according to absolute 
magnitude, ranging from $M_{^{0.1}r} = -16$ to $M_{^{0.1}r} = -23$.
Each sample includes galaxies in an absolute magnitude in- 
terval of 1 magnitude, with successive subsamples overlap- 
ping by 0.5 magnitude. Details are given in Table 1 (Samples 
L1-L13). 
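The overlapping binning scheme can be generated programmatically. The helper below is a hypothetical sketch: it builds half-open intervals one magnitude wide, stepped by half a magnitude, and yields the 13 intervals listed for Samples L1-L13.

```python
def overlapping_bins(faint_edge, bright_edge, width, step):
    """Half-open magnitude intervals [upper - width, upper), faintest first."""
    n_bins = int(round((faint_edge - bright_edge - width) / step)) + 1
    bins = []
    for i in range(n_bins):
        upper = faint_edge - i * step  # faint (numerically larger) edge
        bins.append((upper - width, upper))
    return bins


# Thirteen 1-mag bins overlapping by 0.5 mag, from -16 down to -23.
luminosity_bins = overlapping_bins(-16.0, -23.0, width=1.0, step=0.5)
```

Because successive bins share half their range, the subsamples are not statistically independent, which matters when comparing clustering amplitudes between neighbouring bins.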

Similarly, the galaxies are divided into 6 subsamples according
to $\log_{10} M_*$ (M1-M6 in Table 1). We do not consider
galaxies with $\log_{10} M_* < 9$, because the volume of the survey
over which such systems can be detected is extremely small.
This is illustrated in Fig. 1, where we plot the maximum redshift
out to which a galaxy of mass $M_*$ with maximal $M_*/L$
would be detected in the survey. This calculation assumes
a 13 Gyr single-age stellar population and is based on the
Bruzual & Charlot (2003) models. At stellar masses below
$10^9 M_\odot$, the oldest galaxies are only visible at z < 0.03. For
galaxies with masses less than $10^8 M_\odot$ the maximum redshift
is well below 0.02.

Table 3. Flux-limited samples selected according to physical
quantities

                                  Number of Galaxies
Sample  g-r               SGC     NGCO    NGCE    Total
c1      [0.2, 0.5)        ...     ...     ...     ...
c2      [0.3, 0.6)        ...     9511    8411    ...
c3      [0.4, 0.7)        ...     ...     ...     ...
c4      [0.5, 0.8)        11464   19502   16925   47891
c5      [0.6, 0.9)        ...     ...     ...     ...
c6      [0.7, 1.0)        16557   28504   25476   70537
c7      [0.8, 1.1)        13188   22679   20229   56096
c8      [0.9, 1.2)        ...     ...     ...     ...

Sample  D4000             SGC     NGCO    NGCE    Total
D1      [1.0, 1.3)        ...     ...     ...     ...
D2      [1.1, 1.4)        7755    12868   11050   31673
D3      [1.2, 1.5)        ...     ...     ...     ...
D4      [1.3, 1.6)        9900    17010   14561   41471
D5      [1.4, 1.7)        8281    14296   12192   34769
D6      [1.5, 1.8)        7762    13298   11527   32587
D7      [1.6, 1.9)        9090    15716   13874   38680
D8      [1.7, 2.0)        9902    17143   15622   42667
D9      [1.8, 2.1)        ...     ...     ...     ...
D10     [1.9, 2.2)        4274    7044    6916    18234
D11     [2.0, 2.3)        1108    1654    1766    4528

Sample  C                 SGC     NGCO    NGCE    Total
C1      [1.5, 2.1)        ...     ...     ...     ...
C2      [1.7, 2.3)        ...     ...     ...     ...
C3      [1.9, 2.5)        11027   19257   16399   46683
C4      [2.1, 2.7)        12504   22439   19232   54175
C5      [2.3, 2.9)        13007   23332   20616   56955
C6      [2.5, 3.1)        12841   22119   19967   54927
C7      [2.7, 3.3)        10620   17116   15881   43617
C8      [2.9, 3.5)        6533    9827    9227    25587
C9      [3.1, 3.7)        2647    3752    3580    9979

Sample  $\log_{10}\mu_*$  SGC     NGCO    NGCE    Total
μ1      [8.00, 8.50)      2006    3415    2818    8239
μ2      [8.25, 8.75)      5430    9447    7989    22866
μ3      [8.50, 9.00)      9960    17419   15139   42518
μ4      [8.75, 9.25)      14166   24865   21974   61005
μ5      [9.00, 9.50)      13593   23073   20834   57500
μ6      [9.25, 9.75)      7015    10908   10044   27967
μ7      [9.50, 10.0)      1495    2065    1819    5379

To compare our results to previous work, we have also 
constructed volume-limited subsamples (see Table 2), including
subsamples that are volume-limited in luminosity
(Sample VL1-VL7) and in stellar mass (Sample VM1-VM5). 
The absolute magnitude ranges and redshift ranges used for 
selecting subsamples VL1-VL7 are the same as in Z05. 

As will be described in Section 4.4, we further divide
each luminosity and stellar mass subsample into red
and blue, high D4000 and low D4000, low concentration and
high concentration, low density and high density subsamples
by fitting the distributions of these parameters using
bi-Gaussian functions. These subsamples are also listed in
Table 1. It is also interesting to investigate how clustering
varies as a function of colour/D4000/concentration/surface
density at fixed stellar mass. To this end, we select a sample
of galaxies with stellar masses in the range
$10 < \log_{10} M_* < 11$, and divide the galaxies into subsamples
according to their $g - r$ colours (Sample c1-c8), D4000 values
(Sample D1-D12), concentrations (Sample C1-C10) and
surface mass densities (Sample μ1-μ6). The details of these
subsamples are given in Table 3.



3 CLUSTERING MEASURES 

In this section, we outline our method for measuring the 
galaxy two-point correlation function for a flux-limited sam- 
ple of galaxies. We begin by describing our methods for con- 
structing random samples. We then describe how we correct 
for the effect of fibre collisions. Finally, we describe the 2PCF 
estimator and how measurement errors are calculated. 



3.1 Constructing Random samples 

In order to use galaxy surveys in a statistically meaningful
way, we need to have complete knowledge of their selection
effects. A detailed account of the observational selection effects
accompanies the NYU-VAGC release. The survey geometry
is expressed as a set of disjoint convex spherical
polygons, defined by a set of "caps". This methodology was
developed by Andrew Hamilton to deal accurately and efficiently
with the complex angular masks of galaxy surveys
(Hamilton & Tegmark 2002).² The advantage of using this
method is that it is easy to determine whether a point is
inside or outside a given polygon (Tegmark, Hamilton & Xu
2002). The redshift sampling completeness is then defined
as the number of galaxies with redshifts divided by the total
number of spectroscopic targets in the polygon. The completeness
is thus a dimensionless number between 0 and 1,
and it is constant within each of the polygons. The limiting
magnitude in each polygon is also provided (it changes
slightly across the survey region).
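To illustrate why the cap representation makes the inside/outside test cheap, here is a minimal sketch. It assumes each cap is stored as a unit axis vector plus a number `cm = 1 - cos(opening angle)`, with a negative `cm` denoting the complement of the cap; that storage convention and all function names are assumptions made for this example, not a description of the NYU-VAGC files.

```python
import numpy as np


def unit_vector(ra_deg, dec_deg):
    """Cartesian unit vector for a sky position given in degrees."""
    ra, dec = np.radians(ra_deg), np.radians(dec_deg)
    return np.array([np.cos(dec) * np.cos(ra),
                     np.cos(dec) * np.sin(ra),
                     np.sin(dec)])


def in_cap(point, axis, cm):
    """True if the unit vector `point` lies inside the cap (axis, cm)."""
    cd = 1.0 - float(np.dot(point, axis))  # 1 - cos(angle between point and axis)
    # A negative cm is taken to mean "outside the cap of size |cm|".
    return cd < cm if cm >= 0.0 else cd > -cm


def in_polygon(point, caps):
    """A convex spherical polygon is the intersection of its caps."""
    return all(in_cap(point, axis, cm) for axis, cm in caps)


def completeness(n_with_redshift, n_targets):
    """Redshift sampling completeness of one polygon (0 if it has no targets)."""
    return n_with_redshift / n_targets if n_targets > 0 else 0.0
```

Each test is a single dot product per cap, so checking millions of random points against the mask stays inexpensive.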

We have constructed separate random catalogues for 
each of the three regions of sky. These catalogues are de- 
signed to include all observational selection effects and are 
constructed as follows. First, we select a spatial volume that 
is sufficiently large to contain the survey sample. Then we 
randomly distribute points within the volume and eliminate 
the points that are outside the survey boundary. Adopting 
the same magnitude limits as in the observational sample, 
we select random galaxies and we use the luminosity func- 
tion derived by Blanton et al. (2003c) to assign to each of 
these galaxies an apparent and an absolute magnitude (appropriately
K- and E-corrected, see §2.3).

Since we will estimate the correlation function as a function
of stellar mass, we also need to assign a mass to each
point in the random sample. One way to do this is to use
the observed relation between luminosity and stellar mass 
derived directly from our sample. The black lines in the top 
panel of Fig. 2 show contours of the number density of galax- 
ies in the plane of absolute magnitude vs stellar mass. It can 
be seen from the histograms in the bottom panel of this fig- 
ure that at fixed luminosity, the distribution of the stellar 
mass of galaxies is well described by a Gaussian, with the 
width of the Gaussian decreasing at higher luminosities. 

We have divided the galaxies in our sample into 282 
subsamples separated by 0.03 mag in $M_{^{0.1}r}$. The bin size
was chosen so that each subsample contained at least 500 
galaxies. The stellar mass distribution in each subsample 
is fitted with a Gaussian and the solid lines in the bottom 
panel of Fig. 2 show examples of these fits for several lumi- 
nosity intervals. To test the quality of the fits, we randomly 
assign each galaxy a new stellar mass using the Gaussian 
fits. The red lines in the top panel of Fig. 2 show contours 
of the number density distribution that is predicted by this 
parametrization. The recovered distribution is a good match 
to the observations except in the region corresponding to luminous
galaxies with low $M_*/L$, where the method tends
to overpredict the masses.
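The Gaussian parametrization can be sketched as below. Note the swapped-in simplification: instead of fitting a Gaussian to each histogram, this example uses the sample mean and standard deviation of $\log_{10} M_*$ in each magnitude bin. The function names and the binning passed by the caller are hypothetical.

```python
import numpy as np


def fit_gaussians(abs_mag, log_mass, bin_edges):
    """Mean and scatter of log stellar mass in each absolute-magnitude bin."""
    idx = np.digitize(abs_mag, bin_edges) - 1
    n_bins = len(bin_edges) - 1
    means = np.full(n_bins, np.nan)
    sigmas = np.full(n_bins, np.nan)
    for b in range(n_bins):
        sel = log_mass[idx == b]
        if sel.size >= 2:  # need at least two galaxies for a scatter estimate
            means[b] = sel.mean()
            sigmas[b] = sel.std(ddof=1)
    return means, sigmas


def assign_masses(abs_mag, bin_edges, means, sigmas, rng):
    """Draw a log stellar mass for each point from its bin's Gaussian."""
    idx = np.clip(np.digitize(abs_mag, bin_edges) - 1, 0, len(means) - 1)
    return rng.normal(means[idx], sigmas[idx])
```

Passing the magnitudes of the random points to `assign_masses` gives each one a mass consistent with the observed luminosity-mass relation, subject to the Gaussian assumption whose limits are noted in the text.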

We now introduce a more general method, which should 
still be applicable even when the relation between galaxy lu- 
minosity and the physical property under investigation is not 
well fit by a Gaussian and is subject to redshift-dependent
selection biases.³ Our method takes the observed sample
and randomly re-assigns the position of each galaxy on the
sky, while keeping the redshift, absolute magnitude, stellar
mass, and any other physical quantities fixed. The spectroscopic
incompleteness at each sky position is imposed for the
random points as in the observed sample. To get a random
catalogue as large as possible, we repeat the above procedure
20 times using different random number seeds. In
this way, all possible redshift-dependent selection biases are
automatically taken into account, and it is only the sky position
that is randomized. This method is valid only when the
sample is a wide-angle survey and the variation of its limiting
magnitudes is small across the survey region, both of which
hold for the SDSS. For very large-area surveys such as
the SDSS, randomizing the sky positions should be sufficient 
to break the coherence of the large scale structures in the 
survey. In the next section we will use random catalogues 
constructed using both methods and we will show that the 
measured projected correlation functions are in good agree- 
ment (see Fig. 3). 
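A minimal sketch of the sky-randomization idea follows. It assumes a caller-supplied function `completeness(ra, dec)` that returns the spectroscopic completeness at each position and 0 outside the survey footprint; that function, the rejection sampling scheme and all names are assumptions for illustration. A single random generator is used for all copies rather than 20 separate seeds.

```python
import numpy as np


def draw_sky_positions(n, completeness, rng, batch=100_000):
    """Rejection-sample n positions whose sky density follows `completeness`.

    Loops until n points are accepted, so `completeness` must be
    non-zero somewhere on the sky.
    """
    ra_keep, dec_keep, n_kept = [], [], 0
    while n_kept < n:
        ra = rng.uniform(0.0, 360.0, size=batch)
        # Uniform in sin(dec) gives a uniform density on the sphere.
        dec = np.degrees(np.arcsin(rng.uniform(-1.0, 1.0, size=batch)))
        accept = rng.random(batch) < completeness(ra, dec)
        ra_keep.append(ra[accept])
        dec_keep.append(dec[accept])
        n_kept += int(accept.sum())
    return np.concatenate(ra_keep)[:n], np.concatenate(dec_keep)[:n]


def randomized_catalogue(z, abs_mag, log_mass, completeness, n_copies=20, seed=0):
    """Keep every galaxy's physical properties; replace only its sky position."""
    rng = np.random.default_rng(seed)
    ra, dec = draw_sky_positions(len(z) * n_copies, completeness, rng)
    return {
        "ra": ra,
        "dec": dec,
        "z": np.tile(z, n_copies),
        "abs_mag": np.tile(abs_mag, n_copies),
        "log_mass": np.tile(log_mass, n_copies),
    }
```

Because redshift, magnitude and mass are copied unchanged, any redshift-dependent selection on those quantities is inherited by the random catalogue without being modelled explicitly.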



3.2 Volume Corrections 

When computing correlation functions as a function of stellar
mass, it is important to note that at a given stellar
mass $M_*$, galaxies with lower $M_*/L$ will be detected out to
higher redshifts. A mass-selected sample will thus be biased
to galaxies with younger populations and this may lead to
systematic errors when computing the correlation function
at fixed $M_*$. In this paper, we correct for this $M_*/L$ bias
by computing a weighted correlation function: each galaxy
pair is weighted by the inverse of the volume over which
both galaxies can be detected in the survey. This is similar
to the $1/V_{\rm max}$ correction that one makes when computing
a mass function or luminosity function. The same volume
weighting must also be applied to the random catalogue. It
is very simple to apply the same technique to the catalogues
constructed by randomizing the sky positions, so this will
be our method of choice when estimating correlations as a
function of stellar mass.

³ One example of such a property would be the emission line
luminosity of a central AGN. The line detection limit is a strong
function of redshift, because increasing contamination by light
from the surrounding host galaxy makes extraction of weak lines
more difficult for more distant AGN.

In order to compute the volumes over which galaxies
can be detected, we have computed $z_{\rm min}$ and $z_{\rm max}$ for each
galaxy in the sample, where $z_{\rm min}$ is defined as the redshift
where the galaxy has an r-band magnitude of 14.5 and $z_{\rm max}$
is the redshift where the galaxy has an r-band magnitude of 
17.77. These are derived using the kcorrect code of Blanton 
et al. (2003c). 
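The pair weighting can be sketched as follows, assuming the flat cosmology adopted in this paper ($\Omega_0 = 0.3$, $\Lambda_0 = 0.7$, $h = 1$). The function names are hypothetical, the survey area is left to the caller, and clipping the common redshift range to the sample limits 0.01 < z < 0.3 is an assumption of this example.

```python
import numpy as np

OMEGA_M, OMEGA_L = 0.3, 0.7
C_OVER_H0 = 2997.92458  # c / (100 km/s/Mpc), in h^-1 Mpc


def comoving_distance(z, n_steps=2048):
    """Comoving distance in h^-1 Mpc for a flat universe (trapezoid rule)."""
    zz = np.linspace(0.0, z, n_steps)
    inv_e = 1.0 / np.sqrt(OMEGA_M * (1.0 + zz) ** 3 + OMEGA_L)
    return C_OVER_H0 * np.sum(0.5 * (inv_e[1:] + inv_e[:-1]) * np.diff(zz))


def volume_between(z_lo, z_hi, area_sr):
    """Comoving volume of the shell z_lo..z_hi over a solid angle area_sr."""
    if z_hi <= z_lo:
        return 0.0
    return area_sr / 3.0 * (comoving_distance(z_hi) ** 3
                            - comoving_distance(z_lo) ** 3)


def pair_weight(zmin_i, zmax_i, zmin_j, zmax_j, area_sr, z_limits=(0.01, 0.3)):
    """Inverse of the volume in which both galaxies of a pair are detectable."""
    z_lo = max(zmin_i, zmin_j, z_limits[0])
    z_hi = min(zmax_i, zmax_j, z_limits[1])
    volume = volume_between(z_lo, z_hi, area_sr)
    return 1.0 / volume if volume > 0.0 else 0.0
```

The weight depends only on the overlap of the two detectable redshift ranges, so the same function applies unchanged to data-data, data-random and random-random pairs.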



3.3 Correction for fibre collisions 

In the SDSS survey, two galaxies closer than 55" (corresponding
to $\sim 100\,h^{-1}$ kpc at the median redshift of our sample)
cannot be assigned fibres simultaneously on one spectroscopic
plate. If these fibre "collisions" are not taken into
account, the real-space (or projected) 2PCF will be systematically
underestimated at small separations. In earlier work
(e.g. Zehavi et al. 2002, Tegmark et al. 2004), a correction
was made by simply assigning to each galaxy affected by a
collision the same redshift as its nearest spectroscopically-targeted
neighbour on the sky. Zehavi et al. (2002) have
performed extensive tests of this procedure and have shown
that it works well for $r_p > 0.1\,h^{-1}$ Mpc. Tegmark et al. (2004)
also find no evidence that fibre collisions are boosting their
measured power spectrum on the smallest scales they probe
($k \sim 0.3\,h\,\mathrm{Mpc}^{-1}$).

In this paper, we use a different method for correcting 
for fibre collisions. We measure the angular 2PCF both for 
the spectroscopic samples and for the parent photometric 
sample from which they were drawn; the effect of fibre colli- 
sions can then be estimated and corrected for by comparing 
the two correlation functions. A similar method has been 
used in 2dF clustering analyses by Hawkins et al. (2003). 

Here we briefly summarize our method, which will be described
in more detail in a separate paper (Li et al., in preparation).
We calculate the angular 2PCF for the photometric
sample ($w_p(\theta)$) and for the spectroscopic sample ($w_z(\theta)$).
The quantity

$$F(\theta) = \frac{1 + w_z(\theta)}{1 + w_p(\theta)}, \tag{1}$$


can then be used to account for the effect of fibre collisions.
For each data-data pair, we calculate the angular distance
between the two members of the pair and weight this pair by
$1/F(\theta)$ when estimating the pair counts. If this correction is
not applied, both $w(\theta)$ and $w_p(r_p)$ exhibit a strong "rollover"
in amplitude on small scales. Once the correction is applied,
this feature disappears. In the rest of our analysis, we will
always include the $1/F(\theta)$ weighting in the measurements of
the correlation functions. Since the effect of fibre collisions
is expected to be independent of galaxy property, we do
not derive the correction function $F(\theta)$ for each individual
galaxy sample, but instead derive it from the whole sample
and then apply it to our subsamples.

Figure 3. Projected 2PCF $w_p(r_p)$ in different luminosity intervals
(Samples L3, L5, L7, L9, L11 and L13 in Table 1). When measuring
2PCFs, two methods are used to construct random samples (see §3.1).
The black lines are for the standard method and the red lines are
for the method in which the sky positions are randomized (see the
text for a detailed description). In each panel, the blue line
corresponds to $\xi(r) = (r/5\,h^{-1}\,{\rm Mpc})^{-1.8}$.
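The fibre-collision correction of Section 3.3 can be sketched as below. This is a minimal illustration under stated assumptions: the two angular correlation functions are taken to be tabulated in the same angular bins, pairs falling outside the tabulated range are given unit weight, and the function names and bin layout are hypothetical rather than taken from the analysis code.

```python
def collision_correction(w_z, w_p):
    """F(theta) = (1 + w_z) / (1 + w_p), evaluated bin by bin (Eq. 1).

    w_z: angular 2PCF of the spectroscopic sample, one value per bin.
    w_p: angular 2PCF of the parent photometric sample, same bins.
    """
    return [(1.0 + wz) / (1.0 + wp) for wz, wp in zip(w_z, w_p)]


def fibre_pair_weight(theta, bin_edges, f_theta):
    """Weight 1/F(theta) for a data-data pair at angular separation theta.

    bin_edges has one more entry than f_theta. Pairs outside the tabulated
    range, or in bins where F is not positive, get unit weight (an assumption).
    """
    for lo, hi, f in zip(bin_edges[:-1], bin_edges[1:], f_theta):
        if lo <= theta < hi:
            return 1.0 / f if f > 0.0 else 1.0
    return 1.0
```

For example, a bin in which the spectroscopic sample recovers only half of the expected pairs has F = 0.5, so each surviving pair in that bin is counted twice.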

3.4 Estimator of the Correlation Function and 
errors 

In this paper, the 2PCFs are measured in equal logarithmic
bins of $r_p$ and in equal linear bins of $\pi$, using the Hamilton
(1993) estimator,

$$\xi(r_p,\pi) = \frac{4\,DD(r_p,\pi)\,RR(r_p,\pi)}{[DR(r_p,\pi)]^2} - 1. \tag{2}$$

Here $r_p$ and $\pi$ are the separations perpendicular and parallel
to the line of sight; $DD(r_p,\pi)$ is the count of data-data
pairs with perpendicular separations in the bins
$\log_{10} r_p \pm 0.5\,\Delta\log_{10} r_p$ and with radial separations
in the bins $\pi \pm 0.5\,\Delta\pi$; $RR(r_p,\pi)$ and $DR(r_p,\pi)$ are the
counts of random-random and data-random pairs, respectively. The reason
why we choose different bins for $r_p$ and $\pi$ is the fact that
$\xi(r_p,\pi)$ decreases rapidly as a function of $r_p$, but remains
constant as a function of $\pi$ on small scales. Following standard
practice, we estimate the projected two-point correlation
function $w_p(r_p)$ by

$$w_p(r_p) = 2\int_0^{\infty} \xi(r_p,\pi)\,d\pi = 2\sum_i \xi(r_p,\pi_i)\,\Delta\pi_i. \tag{3}$$




Here the summation for computing $w_p(r_p)$ runs from $\pi_1 = 0.5\,h^{-1}$ Mpc
to $\pi_{40} = 39.5\,h^{-1}$ Mpc, with $\Delta\pi_i = 1\,h^{-1}$ Mpc.
The projected correlation function $w_p(r_p)$ is directly related
to the real-space CF $\xi(r)$ by a simple Abel transform of $\xi(r)$.
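Equations (2) and (3) can be illustrated with a short sketch. It assumes that the pair counts have already been binned, and that the factor of 4 in Eq. (2) corresponds to counting each DD and RR pair once with data and random catalogues of equal size; that normalization is our reading of the formula, not something stated explicitly. The function names are hypothetical.

```python
def hamilton_xi(dd, dr, rr):
    """Hamilton (1993) estimator in the form of Eq. (2): 4*DD*RR/DR^2 - 1."""
    return 4.0 * dd * rr / (dr * dr) - 1.0


def projected_wp(dd_row, dr_row, rr_row, delta_pi=1.0):
    """Projected correlation function at one r_p bin, following Eq. (3).

    Each *_row holds the pair counts in the line-of-sight bins at fixed r_p
    (40 bins centred on 0.5, 1.5, ..., 39.5 h^-1 Mpc in the text), and
    delta_pi is the bin width in h^-1 Mpc.
    """
    xi_sum = sum(hamilton_xi(dd, dr, rr)
                 for dd, dr, rr in zip(dd_row, dr_row, rr_row))
    return 2.0 * delta_pi * xi_sum
```

With this convention an unclustered sample (DD = RR = DR/2 in every bin) gives $\xi = 0$ and therefore $w_p = 0$.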
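Commonly, $w_p(r_p)$ is modelled by a power law

$$w_p(r_p) = A\,r_p^{1-\gamma}. \tag{4}$$

Then $\xi(r)$ is also a power law

$$\xi(r) = (r_0/r)^{\gamma}, \tag{5}$$

with

$$r_0^{\gamma} = \frac{A\,\Gamma(\gamma/2)}{\Gamma(1/2)\,\Gamma[(\gamma-1)/2]}, \tag{6}$$
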

where $\Gamma(x)$ is the Gamma function. However, the
parametrization of the correlation function using only $r_0$
and $\gamma$ does not provide sufficient information to recover
the full observational results unless the correlation function
is a pure power law on all scales. Our results (see below)
show that this is not the case. The departures of $w_p(r_p)$
from a pure power law have also been discussed in previous
papers (e.g. Zehavi et al. 2004). We have thus chosen
to present our results in terms of the measured amplitude
of $w_p(r_p)$ on different physical scales. We also tabulate
the correlation functions so that our readers can recover
them accurately. A detailed description of these tables
(Tables 5 and 6) is given in the Appendix. The tables themselves
are available in electronic form at
http://www.mpa-garching.mpg.de/~leech/papers/clustering/.
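For readers who nevertheless want to move between the power-law parameters of Eqs. (4) to (6), the conversion can be sketched as follows. The function names are hypothetical, and the sketch assumes $\gamma > 1$ so that the Gamma functions are evaluated at positive arguments.

```python
import math


def _gamma_factor(gamma):
    """Gamma(1/2) * Gamma((gamma - 1)/2) / Gamma(gamma/2), for gamma > 1."""
    return (math.gamma(0.5) * math.gamma((gamma - 1.0) / 2.0)
            / math.gamma(gamma / 2.0))


def r0_from_amplitude(amplitude, gamma):
    """Correlation length r0 from the w_p amplitude A, following Eq. (6)."""
    return (amplitude / _gamma_factor(gamma)) ** (1.0 / gamma)


def amplitude_from_r0(r0, gamma):
    """Inverse of Eq. (6): amplitude A of w_p = A * r_p**(1 - gamma)."""
    return r0 ** gamma * _gamma_factor(gamma)
```

As a rough check, the reference line plotted in the figures, $r_0 = 5\,h^{-1}$ Mpc with $\gamma = 1.8$, corresponds to an amplitude A of roughly 67 in units where $r_p$ is measured in $h^{-1}$ Mpc.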

Figure 4. Amplitude of the projected 2PCF $w_p(r_p)$ as a function
of luminosity (Samples L1-L13) at $r_p = 0.2$, 1, 5, and 10 $h^{-1}$ Mpc.

The errors on the clustering measurements are estimated
using the bootstrap resampling technique (Barrow,
Bhavsar, & Sonoda 1984). We generate 100 bootstrap samples
from the observations and compute the correlation functions
for each sample using the weighting scheme (but not
the approximate formula) given by Mo, Jing, & Börner
(1992). The errors are then given by the scatter of the
measurements among these bootstrap samples. The tests in Jing,
Mo & Börner (1998) using mock samples showed that the
bootstrap errors are comparable (within a factor of 2) to the
scatter among different mock samples, indicating that the
error estimates are robust.
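The bootstrap procedure described above can be sketched generically as follows. This is only a schematic: `statistic` stands for whatever measurement is repeated on each resampled catalogue (for instance $w_p$ at one separation), the weighting scheme of Mo, Jing & Börner (1992) is not reproduced, and the function name and seed handling are hypothetical.

```python
import random
import statistics


def bootstrap_error(sample, statistic, n_boot=100, seed=0):
    """Scatter of `statistic` over bootstrap resamplings of `sample`.

    Each bootstrap sample draws len(sample) objects with replacement; the
    returned error is the standard deviation of the statistic over the
    n_boot resampled catalogues.
    """
    rng = random.Random(seed)
    n = len(sample)
    values = []
    for _ in range(n_boot):
        resampled = [sample[rng.randrange(n)] for _ in range(n)]
        values.append(statistic(resampled))
    return statistics.stdev(values)
```

A call such as `bootstrap_error(galaxies, measure_wp)` would then return the 1σ error, where `galaxies` and `measure_wp` are placeholders for the catalogue and the measurement routine.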



4 DEPENDENCE OF CLUSTERING ON 
GALAXY PROPERTIES 

4.1 Luminosity 

Fig. 3 shows the projected 2PCF $w_p(r_p)$ in different luminosity
intervals (Samples L3, L5, L7, L9, L11 and L13 in
Table 1). The red and black lines on the figure compare the
results obtained for the two different methods of construct- 
ing random samples described in Section 3.1. Black lines are 
for the "standard" method in which the selection function 
is explicitly modelled. Red lines are for the method in which 
the sky positions of the observed galaxies are randomly re- 
assigned. The agreement between the two methods is very 
encouraging, suggesting that the latter method does work 
well for analyzing large redshift surveys like the SDSS and 
can be applied in the case of more complicated selection by 
physical parameters with redshift-dependent biases.

To guide the eye, we have plotted the relation
$\xi(r) = (r/5\,h^{-1}\,{\rm Mpc})^{-1.8}$ in blue in every panel in Fig. 3.
In general, we see that the amplitude of the correlation function
increases with luminosity, but the strength of this effect is
different on different scales. For galaxies fainter than L*
($M_{^{0.1}r} = -20.44$), the clustering amplitude stays nearly
constant on very small scales ($r_p \sim 0.1\,h^{-1}$ Mpc), but on
larger scales there is a much stronger luminosity dependence.
For bright galaxies, the correlation amplitude increases
strongly with luminosity at all scales. It is also interesting
that the slope of the correlation function gets flatter
with increasing luminosity for galaxies fainter than L*,
but then steepens again for galaxies brighter than L*. In other
words, L* galaxies exhibit the flattest correlation functions.

Figure 6. Top panel: Relative bias factors for luminosity subsamples
(Samples L1-L13). Bias factors are defined by the relative
amplitude of the $w_p(r_p)$ estimates at a fixed separation of
$r_p = 2.7\,h^{-1}$ Mpc and are normalized by the $-21 < M_{^{0.1}r} < -20$
sample (Sample L9, L ≈ L*). The dashed curve is a fit to
$w_p(r_p)$ measurements in the 2dF survey, b/b* = 0.85 + 0.15 L/L*
(Norberg et al. 2001), and the long dashed curve is a fit obtained
from measurements of the SDSS power spectrum,
b/b* = 0.85 + 0.15 L/L* - 0.04 (M - M*) (Tegmark et al. 2004; note that
here the symbols M and M* denote absolute magnitudes, not
stellar masses). The triangles are obtained from the $w_p(r_p)$
measurements of Zehavi et al. (2005). Bottom panel: Relative bias
factors for stellar mass subsamples (Samples M1-M6). Bias factors
are normalized by Sample M3, where the mean stellar mass
is close to the characteristic stellar mass of the Schechter mass
function.

These trends are illustrated more clearly in Fig. 4, where
we plot the amplitude of the projected correlation function
$w_p(r_p)$ as a function of luminosity at $r_p = 0.2$, 1, 5, and
10 $h^{-1}$ Mpc. At $r_p = 0.2\,h^{-1}$ Mpc, the correlation function
probes galaxy pairs that reside within a common dark matter
halo. At $r_p = 10\,h^{-1}$ Mpc the correlation function should
only be sensitive to pairs of galaxies in separate haloes.
This figure confirms that luminous galaxies cluster more
strongly than faint galaxies, with the difference becoming
more marked above L*. However, the luminosity dependence
of galaxy clustering is different on different scales. On small
scales, the clustering amplitude does not vary with luminosity
for galaxies fainter than L*, but increases steeply for
galaxies brighter than L*. In contrast, the amplitude on
large scales rises more continuously as a function of luminosity.
It is also interesting that the dependence of $w_p(r_p)$
on luminosity appears to change slope at $M_{^{0.1}r} \sim -20$. One
possible reason for this switch in behaviour is that a significant
fraction of faint galaxies are "satellite" systems orbiting
within a common dark matter halo, whereas bright galaxies
are mainly "central" galaxies located at the centers of their
dark matter haloes. We intend to explore this in more detail
in future work.

Figure 5. Comparison of our $w_p(r_p)$ measurements with those of
Zehavi et al. (2005, red). The black and green lines are for
volume-limited and magnitude-limited samples, respectively. The blue
line in the bottom-left panel is for the volume-limited sample with
the redshift threshold reduced from 0.10 to 0.07. See the text for a
more detailed description.

To compare our results to previous studies, we have 
also computed correlation functions using samples that are 
volume-limited in luminosity (Samples VL1-VL7). The re- 
sults are shown in Fig. 5. Black lines show the correlation 
functions for samples VL1-VL7. For comparison, the mea- 
surements provided by Z05 are shown in red and the correla- 
tion functions computed from the corresponding magnitude- 
limited subsamples are shown in green. The agreement 
between the magnitude-limited analysis and the volume- 
limited one indicates that our results are robust and reliable. 
Furthermore, it can be seen that our measurements are in 
good agreement with those carried out by Z05, although 
there are some small differences. These are probably due to 
the different 2PCF estimators or the different methods of 
constructing random samples. We note that the magnitude-limited
sample of galaxies with $-20 < M_{^{0.1}r} < -19$ (Sample L7)
and the magnitude-limited and volume-limited samples
with $-21 < M_{^{0.1}r} < -20$ (Samples L9 and VL5) all exhibit
anomalously high $w_p(r_p)$ values at large separations
($r_p \gtrsim 5\,h^{-1}$ Mpc). As pointed out by Z05, this anomalous
behavior is a "cosmic variance" effect caused by an enormous
supercluster at z ~ 0.08, which overlaps these three samples.
When Sample VL5 is restricted to redshifts below
0.07, its projected correlation function drops and steepens
(blue line in Fig. 5), coming into good agreement with that
of Z05.

Figure 7. Projected 2PCF for stellar mass subsamples, as indicated.
The red lines show the results obtained by applying volume
corrections, compared with those obtained without the corrections
(black). The blue lines in some panels are for the samples that are
volume-limited in M*. In each panel, the green line corresponds to
$\xi(r) = (r/5\,h^{-1}\,{\rm Mpc})^{-1.8}$.

Figure 8. Examples of the bimodal distribution of physical quantities
in different luminosity intervals, as indicated. In each panel,
the histogram is for the data, while the green and blue lines are
the best-fit Gaussians and the red line is their sum. $N_0$ is the
maximum of the total fit, and the quantities with a zero give the
median of the two Gaussian centers.

Following Z05, we calculate the relative bias factor b/b*
as a function of normalized luminosity L/L*. The relative
bias factor is defined by the amplitude of $w_p(r_p)$ measured
at a fixed separation $r_p = 2.7\,h^{-1}$ Mpc relative to the value
measured for the $-21 < M_{^{0.1}r} < -20$ subsample (Sample
L9, which has L ≈ L*). This fiducial separation of
2.7 $h^{-1}$ Mpc was chosen because it is well out of the very
non-linear regime, but still small enough so that the correlation
functions are very accurately measured in all surveys.
The solid circles in Fig. 6 show our results and the triangles
show the SDSS results from Z05. The long dashed curve is
taken from Tegmark et al. (2004), where bias factors are derived
from the galaxy power spectrum P(k) at wavelength
$2\pi/k \sim 100\,h^{-1}$ Mpc. The results in this paper and in Z05,
both of which are derived from $w_p(r_p)$ measurements, agree
very well (as they should). Our results are also in quite good
agreement with those of Tegmark et al. The dashed curve
in Fig. 6 shows the result of Norberg et al. (2001), based on
$w_p(r_p)$ measurements of somewhat more luminous galaxies
in the 2dF survey ($\log_{10}(L/L_*) \gtrsim -0.6$). The agreement is
again very good over the range of luminosities where the
different analyses overlap.
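The two fitting formulae quoted in the caption of Fig. 6 can be evaluated directly as a function of absolute magnitude. The sketch below is illustrative only: it adopts the value $M_{^{0.1}r} = -20.44$ quoted above for L* as the reference magnitude for both fits (the original fits may have used slightly different values), converts magnitude to luminosity with the standard relation L/L* = 10^(-0.4 (M - M*)), and uses hypothetical function names.

```python
M_STAR_MAG = -20.44  # absolute magnitude of L* quoted in the text


def luminosity_ratio(abs_mag, m_star=M_STAR_MAG):
    """L/L* from absolute magnitudes: 10**(-0.4 * (M - M*))."""
    return 10.0 ** (-0.4 * (abs_mag - m_star))


def bias_norberg(abs_mag):
    """b/b* = 0.85 + 0.15 L/L* (fit of Norberg et al. 2001)."""
    return 0.85 + 0.15 * luminosity_ratio(abs_mag)


def bias_tegmark(abs_mag):
    """b/b* = 0.85 + 0.15 L/L* - 0.04 (M - M*) (fit of Tegmark et al. 2004)."""
    return bias_norberg(abs_mag) - 0.04 * (abs_mag - M_STAR_MAG)
```

Both fits equal unity at L = L* by construction; they differ only through the term linear in magnitude, which raises the Tegmark et al. bias above L* and lowers it below L*.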

4.2 Stellar mass 

In this section, we present measurements of the projected
2PCF $w_p(r_p)$ as a function of stellar mass. As discussed in
Section 3.2, when computing $w_p(r_p)$ as a function of mass,
we weight each galaxy pair by the inverse of the volume over
which both galaxies can be detected in the survey. The effect
of this correction can be seen in Fig. 7 by comparing the black
lines (no volume weighting) with the red lines (with volume
weighting). As can be seen, the volume correction steepens
the correlation function of low mass galaxies. Recall that if
no volume correction is applied, the sample is biased towards
galaxies with low M*/L ratios. As we will show in detail in
the following section, the slope of the correlation function
is very sensitive to the colour (and hence the young stellar
content) of galaxies, particularly for low mass systems. This
is why the volume corrections make the most difference for
galaxies in our two lowest mass bins. We have also compared
our results with the measurements obtained using samples
that are volume-limited in stellar mass (blue lines). As can
be seen, the results obtained for the volume-limited samples
agree very well with the volume-corrected $w_p(r_p)$.

Figure 9. Contours of number density of galaxies in the planes
of luminosity vs physical quantities. The black lines are for the
data, while the red lines are reconstructed from the best-fitting
bi-Gaussians (see Fig. 8; also see the text for a detailed
description). The blue lines are the best linear fits to the median
Gaussian centers as a function of luminosity (see Fig. 8). These are
the luminosity-dependent cuts that we adopt for dividing galaxies
according to a given physical property. The green line in the
top-left panel is the $g-r$ cut adopted by Zehavi et al. (2005).

The bottom panel of Fig. 6 shows the relative bias factor
b/b* at $r_p = 2.7\,h^{-1}$ Mpc as a function of stellar mass,
with points showing the results from our $w_p(r_p)$ measurements
based on samples M1-M6, and dashed lines showing
the fit to the measurements, b/b* = 0.90 + 0.10 M/M*. The
value M* is determined by fitting a Schechter function to
the stellar mass function of the galaxies in our sample. We
obtain $M_* = (4.11 \pm 0.02) \times 10^{10}\,h^{-2}\,M_\odot$,
$\alpha = -1.073 \pm 0.003$ and $\phi_* = 0.0204 \pm 0.0001\,h^{3}\,{\rm Mpc}^{-3}$
(Wang et al., in preparation). Qualitatively, the behaviour of the
relative bias as a function of M* is very similar to the results
obtained as a function of L. This is not surprising, because luminosity
and stellar mass are reasonably tightly correlated (see Fig.
2). What is of interest, however, is that these measurements
can be used to set constraints on the fraction of baryons
that have been turned into stars in dark matter haloes of
different mass. We will come back to this in future work.
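To make the quoted Schechter parameters concrete, the sketch below evaluates the mass function per decade of stellar mass together with the linear bias fit given above. It assumes the standard Schechter form, $\phi(M)\,dM = \phi_*\,(M/M_*)^{\alpha}\,e^{-M/M_*}\,dM/M_*$, which this section does not write out explicitly; the constants are the best-fit values quoted in the text and the function names are hypothetical.

```python
import math

M_STAR = 4.11e10     # characteristic stellar mass in h^-2 M_sun (from the text)
ALPHA = -1.073       # low-mass slope (from the text)
PHI_STAR = 0.0204    # normalization in h^3 Mpc^-3 (from the text)


def schechter_per_dex(mass):
    """Number density of galaxies per dex in stellar mass, in h^3 Mpc^-3.

    Uses dn/dlog10(M) = ln(10) * phi_* * x**(alpha + 1) * exp(-x), x = M/M*.
    """
    x = mass / M_STAR
    return math.log(10.0) * PHI_STAR * x ** (ALPHA + 1.0) * math.exp(-x)


def mass_bias_fit(mass):
    """Relative bias fit quoted in the text: b/b* = 0.90 + 0.10 M/M*."""
    return 0.90 + 0.10 * mass / M_STAR
```

At M = M* the per-dex density is about 0.017 $h^{3}$ Mpc$^{-3}$ and the fitted relative bias is unity; above M* the abundance falls exponentially while the fitted bias rises linearly with mass.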



4.3 Division by physical parameters 

We now investigate how the clustering of galaxies of given
luminosity (or stellar mass) depends on properties such as
colour, 4000 Å break strength, concentration and surface
mass density. Z05 performed such an analysis in the space
of luminosity vs $g-r$ colour. They adopted a tilted colour
cut motivated by the colour-magnitude diagram. A similar
colour division is presented in Baldry et al. (2004),
who found that the distribution of galaxy colour could be
well approximated using bi-Gaussian functions (see also
Fig. 8 here). Fig. 8 shows that other physical
quantities, such as D4000, C and $\mu_*$, also exhibit bimodal
distributions. We thus fit bi-Gaussian functions to the
distributions of $g-r$, D4000, C and $\log\mu_*$ for each of the 282
luminosity subsamples described in §3.1. These are shown in
Fig. 8 for three representative luminosity intervals. In Fig. 9
we illustrate how well these fits recover the true distribution
of these parameters as a function of luminosity. Black lines
show contours of the actual number density of galaxies and
red lines show the predicted number densities from the
bi-Gaussian fits. As can be seen, the bi-Gaussian model does a
reasonable job of reproducing the observations.

The division of the luminosity subsamples into red and
blue, high D4000 and low D4000, high concentration and low
concentration, high surface density and low surface density,
is made at the mean of the two Gaussian centers in each
luminosity bin. In Fig. 9, triangles indicate the two Gaussian
centers and the crosses are the mean of these centers. We fit
the dividing point as a function of luminosity using a linear
equation of the form (see Fig. 9, blue lines)

$$P = A + B \cdot M_{^{0.1}r}, \tag{7}$$

where P is the physical parameter under investigation, and
A and B are the best-fitting linear coefficients. These are
listed in Table 4 for reference. Using these best-fitting cuts
(Eqn. 7), we divide the galaxies in each of the 13 luminosity
samples (Samples L1-L13) and the 5 stellar mass samples
(M1-M5) into two further subsamples. For simplicity,
we use "red" to denote the subsamples with larger values
of the physical quantity and "blue" for the subsamples with
the smaller values. The percentages of galaxies in the "red"
subsamples are listed in the last 4 columns of Table 1.

Table 4. Coefficients for the formula of dividers in physical
quantities

Quantity              A                   B
$g-r$                 -0.788 ± 0.028      -0.078 ± 0.001
D4000                 -0.563 ± 0.038      -0.108 ± 0.002
C                     -0.498 ± 0.193      -0.150 ± 0.009
$\log_{10}\mu_*$       3.738 ± 0.213      -0.256 ± 0.011
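The luminosity-dependent cuts of Eq. (7) are simple to apply in practice. The sketch below uses the central values of the coefficients in Table 4 and ignores their uncertainties; the dictionary keys and function names are hypothetical labels chosen for illustration.

```python
# Central values of (A, B) from Table 4, for P = A + B * M_0.1r.
CUT_COEFFS = {
    "g-r": (-0.788, -0.078),
    "D4000": (-0.563, -0.108),
    "C": (-0.498, -0.150),
    "log_mu_star": (3.738, -0.256),
}


def dividing_value(quantity, abs_mag):
    """Dividing value of a physical quantity at absolute magnitude abs_mag (Eq. 7)."""
    a, b = CUT_COEFFS[quantity]
    return a + b * abs_mag


def classify(quantity, value, abs_mag):
    """Return "red" for the larger-valued subsample and "blue" otherwise."""
    return "red" if value > dividing_value(quantity, abs_mag) else "blue"
```

For example, at $M_{^{0.1}r} = -20$ the colour cut falls at $g-r$ = 0.772, so a galaxy of that luminosity with $g-r$ = 0.90 is assigned to the "red" subsample.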



4.3.1 In luminosity bins

The projected 2PCFs in the space of luminosity vs colour, 
D4000, concentration and surface density are presented in 
Fig. 10. Red (blue) lines correspond to the "red" ("blue") 
subsamples. Black lines are for the sample as a whole. Fig. 11 
shows the measurements of the amplitude of $w_p(r_p)$ at $r_p = 0.2$,
1, 5 and 10 $h^{-1}$ Mpc.

When the sample is divided by g — r colour, redder 
galaxies of all luminosities are more strongly clustered and 
have steeper correlation functions than their blue counter- 
parts. This colour dependence is much stronger for faint 
galaxies than for bright galaxies, particularly on small scales. 
Fig. 11 shows that the clustering amplitude of blue galax- 
ies increases as a function of luminosity at all scales. How- 
ever, the situation is more complicated for red galaxies. On 
small scales, faint red galaxies are clustered more strongly 
than bright red galaxies. On large scales, however, the trend 
reverses and the clustering amplitude increases with lumi- 
nosity. These results are all consistent with the findings of 
Z05. The behaviour of the slope of the correlation function 
as a function of luminosity is also different for red and blue 
galaxies. The correlation function of faint red galaxies is very 
steep and the slope flattens systematically as luminosity in- 
creases. In contrast, the slope of the correlation function of 
blue galaxies exhibits rather little change with luminosity. 
All these trends are qualitatively consistent with a picture 
in which faint red galaxies are primarily "satellite" systems 
in massive dark matter haloes, but faint blue galaxies oc- 
cupy haloes of smaller mass (Z05; Berlind et al. 2005; Li et 
al. 2006). 

Our results show that the dependence of clustering on
D4000 is very similar to what is obtained for $g-r$ colour.
On the other hand, rather different results are obtained for
the structural parameters C and $\mu_*$. Fig. 11 clearly shows
that the dependence of $w_p(r_p)$ on $g-r$/D4000 is considerably
stronger than the dependence on C/$\mu_*$ at all physical scales.

Figure 10. Projected correlation function $w_p(r_p)$ for galaxies in
different luminosity intervals and with different properties (from left
to right: $g-r$ colour, D4000, concentration and log stellar surface
mass density $\log_{10}\mu_*$). The panels in each column are for different
luminosity subsamples, with the range of absolute magnitude indicated
in the left column; the horizontal axis of every panel is $r_p$ in
$h^{-1}$ Mpc. In each panel, the black line is for the full sample,
the red (blue) line is for the subsample with larger (smaller) value of
the corresponding physical parameter, and the green line corresponds
to $\xi(r) = (r/5\,h^{-1}\,{\rm Mpc})^{-1.8}$.



4.3.2 In stellar mass bins

The projected 2PCFs in the space of stellar mass vs the 
same set of physical parameters are presented in Fig. 12. The 
measurements at $r_p = 0.2$, 1, 5 and 10 $h^{-1}$ Mpc are plotted
in Fig. 13. 

Qualitatively, the results shown in Figs. 11 and 13 ap- 
pear very similar. However, careful comparison of these two 
figures shows that interesting quantitative differences do ex- 
ist between the clustering of the "red" and "blue" subsam- 
ples at fixed luminosity and at fixed stellar mass. On small 
scales, the dependences are stronger when evaluated at fixed 
mass, particularly for low mass galaxies. We also note that 
there is a small difference in the clustering amplitude of 
the "red" and "blue" subsamples at projected radii as large 
as 10 $h^{-1}$ Mpc in Fig. 11. This difference is seen both in
$g-r$ colour and D4000 and more weakly in the structural
parameters C and $\mu_*$. Fig. 13 shows, however, that at fixed
stellar mass there is no longer any significant difference in
the clustering amplitude of high concentration and low
concentration galaxies or high surface density and low surface
density galaxies on scales larger than 5 $h^{-1}$ Mpc. The
clustering differences in $g-r$ and D4000 do persist, however.
This is a rather surprising result, because at scales larger 
than a few Mpc, galaxies inside the same dark matter halo 



no longer contribute to the clustering signal. Our result thus 
indicates that at fixed stellar mass, the clustering properties 
of the surrounding dark matter haloes are somehow corre- 
lated with the colour of the selected galaxies. 

To investigate this effect further, we have computed the
2PCF as a function of $g-r$, D4000, C and $\mu_*$ for galaxies
spanning a narrow range in stellar mass ($10^{10} - 10^{11}\,M_\odot$).
In Fig. 14, we plot the amplitude of the correlation function
as a function of these quantities measured on four different
physical scales ($r_p = 0.2$, 1, 5 and 10 $h^{-1}$ Mpc). This figure
confirms that the dependence of $w_p(r_p)$ on $g-r$/D4000 extends
out to larger physical scales than the dependence of
$w_p(r_p)$ on C/$\mu_*$. The figure also shows that the dependence
of $w_p(r_p)$ on C and $\mu_*$ is qualitatively quite different
on small scales. On scales less than 1 $h^{-1}$ Mpc, the amplitude
of the correlation function is constant for "young"
galaxies with 1.1 < D4000 < 1.5 and a steeply rising function
of age for "older" galaxies with D4000 > 1.5. In contrast, the
dependence of the amplitude of $w_p(r_p)$ on concentration is
strongest for disk-dominated galaxies with C < 2.6 on these
same scales. This demonstrates that different physical processes
are required to explain environmental trends in star
formation and in galaxy structure.

Figure 11. $w_p(r_p)$ measured at $r_p = 0.2$, 1, 5, and 10 $h^{-1}$ Mpc,
as a function of luminosity. The different columns show the dependence
on different physical quantities, as indicated. In each panel, the
black line is for the full sample, and the red (blue) line is for the
subsample with larger (smaller) value of the corresponding physical
parameter.



5 SUMMARY AND DISCUSSION 

In this paper we present our determinations of the projected
two-point correlation function (2PCF) $w_p(r_p)$ for different
classes of galaxies in order to study the dependence
of clustering on the physical properties of these systems. We
use the New York University Value Added Catalog (NYU-VAGC),
which is constructed from the Sloan Digital Sky
Survey Data Release Two (SDSS DR2).

The conclusions of this paper can be summarized as 
follows: 

(i) We confirm previous findings that luminous galaxies
cluster more strongly than faint galaxies, with the difference
becoming larger for galaxies with L > L*, where
L* is the characteristic luminosity of the Schechter (1976)
function. The dependence of galaxy clustering on luminosity
is different on different physical scales. On small scales
($r_p \sim 0.2\,h^{-1}$ Mpc), the correlation amplitude is almost
constant for galaxies fainter than L*, but the amplitude
increases sharply above L*. On large scales, the correlation
amplitude increases more continuously as a function of
luminosity. Around L* there appears to be a shoulder, with
$w_p(r_p)$ increasing more steeply with L for higher-luminosity
galaxies. Our results are in good agreement with previous
studies of clustering as a function of luminosity in the SDSS.

(ii) We present $w_p(r_p)$ as a function of stellar mass. In
analogy with previous results obtained as a function of
luminosity, we find that more massive galaxies cluster more
strongly than less massive galaxies, with the difference
increasing above the characteristic stellar mass M* of the
Schechter mass function.

(iii) When galaxies are divided according to their physical
properties, we find that galaxies with redder colours, larger
4000 Å break strengths, more concentrated structure, and
higher surface mass densities cluster more strongly and have
steeper correlation functions at all luminosities and masses.
The differences in clustering strength are larger on small
scales and for low-luminosity and less massive galaxies.

(iv) We have found that the dependence of $w_p(r_p)$ on
$g-r$ or D4000 extends out to larger physical scales
($r_p > 5\,h^{-1}$ Mpc) than the dependence of $w_p(r_p)$ on C or
$\mu_*$. On small scales (~ 0.2 $h^{-1}$ Mpc), the behaviour of
$w_p(r_p)$ as a function of $g-r$ or D4000 and as a function of C
are qualitatively different.

Figure 12. Projected correlation function $w_p(r_p)$ for galaxies in
different stellar mass intervals and with different physical properties
(from left to right: $g-r$ colour, D4000, concentration and log stellar
surface mass density $\log_{10}\mu_*$). The panels in each column are for
different stellar mass subsamples, with the range of stellar mass
indicated in the left column. In each panel, the black line is for the
full sample, the red (blue) line is for the subsample with larger
(smaller) value of the corresponding physical parameter, and the green
line corresponds to $\xi(r) = (r/5\,h^{-1}\,{\rm Mpc})^{-1.8}$.

We have chosen not to express our results in terms of 
power-law fits to our $w_p(r_p)$ measurements. We have tabulated
the measurements of our correlation functions so that
they can be accurately recovered. As discussed by Z05, a 
single power-law is a poor description of the data and as 
we expand our exploration of physical parameter space, it is 
important not to place unnecessary restrictions on the way 
in which the observational results are described. In this pa- 
per, we have chosen to plot trends in clustering amplitude 
evaluated on a variety of different physical scales. This leads to
a number of interesting insights that have not received much
attention up to now: (1) the dependence of the clustering
amplitude (or equivalently, the relative bias factor) on lumi- 
nosity is qualitatively different on small scales and on large 
scales, (2) there is a different scale dependence in the ampli- 
tude of the correlation function for parameters that measure 
the star formation histories of galaxies and for parameters 
that measure galaxy structure, suggesting that the trends 
in star formation and in galaxy structure are governed by 
different physical processes. 



Finally, it is worth comparing our results with the many 
studies that have examined correlations between galaxy 
properties and the local environment. One of the most fun- 
damental correlations between the properties of galaxies in 
the local Universe is the so-called morphology-density relation.
Oemler (1974) and Dressler (1980) pioneered the quantification
of this relation, showing that spheroidal systems
reside preferentially in dense regions. Since the standard 
morphological classification scheme mixes elements that de- 
pend on the structure of a galaxy with elements related to 
its recent star formation history, it is by no means obvious 
that these two elements should depend on environment in 
the same way. 

Recent studies using large surveys such as the SDSS
have revealed that galaxy colour is the galaxy property
most predictive of the local environment (e.g. Blanton et
al. 2005b; Kauffmann et al. 2004). Hogg et al. (2003) show
that the local density increases strongly with luminosity for
the brightest galaxies. For faint galaxies, local density is
sensitive mainly to colour, with faint red galaxies occupying
the highest-density regions. Blanton et al. (2005b) found that
at fixed luminosity and colour, density is not closely related
to surface brightness or to the Sérsic index (a quantity
related to galaxy structure), so that morphological properties
of galaxies are less closely related to galaxy environment
than their luminosities and star formation histories.
Kauffmann et al. (2004) obtained very similar results. They found
that at fixed stellar mass both star formation and nuclear
activity depend strongly on local density, while structural
parameters such as size and concentration are almost
independent of it.

Figure 13. $w_p(r_p)$ measured at $r_p = 0.2$, 1, 5, and 10 $h^{-1}$ Mpc,
as a function of stellar mass. The different columns show the
dependence on different physical quantities, as indicated. In each
panel, the black line is for the full sample, and the red (blue) line
is for the subsample with larger (smaller) value of the corresponding
physical parameter.

Our analyses of $w_p(r_p)$ as a function of luminosity,
colour and structural parameters are consistent with these
conclusions. The power of the $w_p(r_p)$ statistic is that it
encapsulates information about how galaxy properties depend
on environment over a wide range of physical scales.
Kauffmann et al. (2004) found no evidence for a significant
dependence of galaxy structure on local density. However, their
local densities are calculated in a fixed aperture of 2 $h^{-1}$ Mpc,
whereas our plots (see Fig. 14) show clearly that the dependence
of structural parameters on environment becomes significant
on scales that are smaller than this value.

The other advantage of $w_p(r_p)$ is that it can be very 
easily compared with the predictions of galaxy formation 
simulations. It probes the physical processes occurring inside 
individual dark matter haloes as well as the masses of the dark 
matter haloes that host galaxies of given mass, luminosity, 
size, age and concentration, thus placing strong constraints 
on theoretical models. This will be the focus of future work. 
APPENDIX A: DESCRIPTION OF ONLINE TABLES 

Tables 5 and 6 contain our $w_p(r_p)$ measurements for 
the galaxies with different luminosities/stellar masses. 
Results are also given as a function of the physical 
quantities g − r, D4000, C and $\mu_*$. The tables 
are available in electronic form at 
http://www.mpa-garching.mpg.de/~leech/papers/clustering/. 

Tables 5 and 6 list the data points in Figs. 10 and 12 
respectively, i.e. the measured projected 2PCF $w_p(r_p)$ for 
galaxies in different luminosity or stellar mass intervals and 
with different properties. Table 5 consists of 13 separate 
parts corresponding to the 13 luminosity samples (Samples 
L1-L13 in Table 1). Likewise, Table 6 consists of 5 parts 
corresponding to the 5 stellar mass samples (Samples M1-M5 
in Table 1). To make the description clearer, we present 
here an abridged version of the first part of Table 5, listing 
only the first several rows for Sample L1. The first column 
is the projected separation $r_p$ in units of $h^{-1}$ Mpc, ranging 
from ~ 0.1 $h^{-1}$ Mpc up to ~ 45 $h^{-1}$ Mpc. The other columns 
give the $w_p(r_p)$ measurements and errors for the full 
luminosity/stellar mass sample (Column 2) and the "red" and 
"blue" subsamples divided by g − r (Columns 3-4), D4000 
(Columns 5-6), C (Columns 7-8) and $\log_{10} \mu_*$ (Columns 
9-10). Short straight lines denote the points that have no 
measurements, either because of low S/N or for any other reason. 
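
For readers who wish to load the electronic tables, the column layout described above can be sketched in Python as follows. This is only an illustrative sketch, not the format specification of the online files: it assumes a whitespace-separated text row in which the first field is r_p and each of Columns 2-10 is stored as two numbers (measurement, then error), and that each missing number is marked by a run of dashes. The function name `parse_row`, the subsample labels, and the set of missing-value markers are hypothetical and should be adapted to the actual files.

```python
import math

# Shorthand labels for Columns 2-10, in the order given in the text.
# The label strings themselves are our own invention.
SUBSAMPLES = [
    "full",
    "g-r red", "g-r blue",
    "D4000 red", "D4000 blue",
    "C red", "C blue",
    "mu* red", "mu* blue",
]

# Assumed markers for "no measurement" (the text only says "short straight lines").
MISSING = {"-", "--", "---"}


def _to_float(field):
    """Convert one text field to a float, mapping missing markers to NaN."""
    return math.nan if field in MISSING else float(field)


def parse_row(line):
    """Parse one table row into (r_p, {label: (w_p, error)}).

    Assumes 1 + 2 * 9 = 19 whitespace-separated fields per row.
    """
    fields = line.split()
    expected = 1 + 2 * len(SUBSAMPLES)
    if len(fields) != expected:
        raise ValueError(f"expected {expected} fields, got {len(fields)}")
    r_p = float(fields[0])
    row = {}
    for i, label in enumerate(SUBSAMPLES):
        value = _to_float(fields[1 + 2 * i])
        error = _to_float(fields[2 + 2 * i])
        row[label] = (value, error)
    return r_p, row
```

Missing entries are returned as NaN rather than dropped, so that every row keeps the same set of keys and the points without measurements can be masked explicitly when plotting or fitting.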



